A semimartingale characterization of average optimal stationary policies for Markov decision processes
Similar resources
A Semimartingale Characterization of Average Optimal Stationary Policies for Markov Decision Processes
This paper deals with discrete-time Markov decision processes with Borel state and action spaces. The criterion to be minimized is the average expected cost, and the costs may have neither upper nor lower bounds. In our earlier paper (to appear in the Journal of Applied Probability), weaker conditions were proposed to ensure the existence of average optimal stationary policies. In this paper, we fur...
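For orientation, the average cost criterion referred to here can be written in its standard form (notation assumed, not quoted from the paper): with one-stage cost c, policy \pi, and expectation E_x^\pi given initial state x,
\[
J(\pi, x) \;=\; \limsup_{n \to \infty} \frac{1}{n}\, E_x^{\pi}\!\left[\sum_{t=0}^{n-1} c(x_t, a_t)\right],
\]
and a stationary policy f^{*} is average optimal if J(f^{*}, x) = \inf_{\pi} J(\pi, x) for every initial state x.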
Efficient Policies for Stationary Possibilistic Markov Decision Processes
Possibilistic Markov Decision Processes offer a compact and tractable way to represent and solve problems of sequential decision under qualitative uncertainty. Even though appealing for its ability to handle qualitative problems, this model suffers from the drowning effect that is inherent to possibilistic decision theory. The present paper proposes to escape the drowning effect by extending to...
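As background, the drowning effect can be seen in the standard pessimistic possibilistic utility (a common definition from possibilistic decision theory, not quoted from the paper): with trajectories \tau, possibility degrees \pi(\tau \mid \delta), and qualitative utilities \mu(\tau) on a scale [0,1],
\[
u_{\mathrm{pes}}(\delta) \;=\; \min_{\tau}\; \max\bigl(1 - \pi(\tau \mid \delta),\; \mu(\tau)\bigr),
\]
so two policies sharing a fully possible, worst-utility trajectory receive the same value regardless of how they differ elsewhere.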
Sample-Path Optimal Stationary Policies in Stable Markov Decision Chains with the Average Reward Criterion
This work concerns discrete-time Markov decision chains with denumerable state and compact action sets. Besides standard continuity requirements, the main assumption on the model is that it admits a Lyapunov function ℓ. In this context the average reward criterion is analyzed from the sample-path point of view. The main conclusion is that, if the expected average reward associated to ...
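The sample-path average reward criterion mentioned here is, in a standard form (notation assumed, not quoted from the paper),
\[
S(\pi, x) \;=\; \limsup_{n \to \infty} \frac{1}{n} \sum_{t=0}^{n-1} R(X_t, A_t) \quad P_x^{\pi}\text{-almost surely},
\]
with the Lyapunov function ℓ playing the role of a stability condition under which these pathwise time averages are well behaved.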
Splitting Randomized Stationary Policies in Total-Reward Markov Decision Processes
This paper studies a discrete-time total-reward Markov decision process (MDP) with a given initial state distribution. A (randomized) stationary policy can be split on a given set of states if the occupancy measure of this policy can be expressed as a convex combination of the occupancy measures of stationary policies, each selecting deterministic actions on the given set and coinciding with th...
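The occupancy measure underlying this splitting notion is, in its usual form for a total-reward MDP with initial distribution \mu (standard definition, not quoted from the paper),
\[
Q^{\sigma}(x, a) \;=\; \sum_{t=0}^{\infty} P_{\mu}^{\sigma}\bigl(X_t = x,\; A_t = a\bigr),
\]
and a stationary policy \sigma splits on a set of states if Q^{\sigma} = \sum_i \alpha_i Q^{\varphi_i} for stationary policies \varphi_i acting deterministically on that set, with \alpha_i \ge 0 and \sum_i \alpha_i = 1.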
Quantized Stationary Control Policies in Markov Decision Processes
For a large class of Markov Decision Processes, stationary (possibly randomized) policies are globally optimal. However, in Borel state and action spaces, the computation and implementation of even such stationary policies are known to be prohibitive. In addition, networked control applications require remote controllers to transmit action commands to an actuator with low information rate. Thes...
Journal
Title: Journal of Applied Mathematics and Stochastic Analysis
Year: 2006
ISSN: 1048-9533, 1687-2177
DOI: 10.1155/jamsa/2006/81593